---
title: first snkrfinder.model a
keywords: fastai
sidebar: home_sidebar
nb_path: "nbs/02_model.ipynb"
---
print(Path().cwd())
os.chdir(L_ROOT)
print(Path().cwd())
filename = ZAPPOS_DF_SIMPLIFIED # "zappos-50k-simplified"
df = pd.read_pickle(f"data/{filename}.pkl")
Because we simply want to collect the features output by the model, rather than do classification (or some other decision), I replaced the classifier head with a simple identity mapper. A tiny `Identity` nn.Module class makes this easy.
Finally, since we are calculating the features (the embedding) for over 30k images with the net, let's load the computation onto our GPU. We need to remember to do this in evaluation mode so the BatchNorm/Dropout layers are disabled. [I forgot to do this initially and lost hours trying to figure out why I wasn't getting consistent results.] Setting `param.requires_grad = False` saves memory since we aren't going to fit any weights for now, and protects us in case we forget a `with torch.no_grad()` before inference.
Later, when we use the full fastai API, this should all be handled elegantly behind the scenes.
This function neuters the MobileNet v2 and pools the output with avg/max across the spatial dimensions using raw torch calls.
This function neuters the fastai resnet18 and pools the output with avg/max across the spatial dimensions using the low-level fastai API torch wrappers.
We should probably set up our dataloaders to load all of the data into the 'valid' split. Then we don't need to worry about the Resize() transform doing random things.
This creates the right indices to get an empty dls.train and all images in dls.valid, but it's not happy creating an empty train dataloader...
IndexSplitter(df.index.values.tolist())(df),df.shape
It turns out this feature of Resize results in a random crop in training contexts, which is a bit of a bug for what I'm doing here. The other special features fastai's Resize has, compared to coding it with torch, make it worth these hacks. For now I'll subclass a FeatsResize and replace the before_call() method, which performs the split_idx voodoo.
sz=IMG_SIZES['small']
device = get_cuda()
batch_size = 64
dls = get_zap_feats_dataloaders(df,batch_size,sz,device)
model = get_mnetV2_feature_net(to_cuda=True)
df_f = get_feats_df(dls,model)
df_f.head()
IMG_SIZES
def dump_pickle(filepath, item_to_save):
    with open(filepath, "wb") as f:
        pickle.dump(item_to_save, f)

def load_pickle(filepath):
    with open(filepath, "rb") as infile:
        return pickle.load(infile)
model = create_cnn_featurenet(torchvision.models.mobilenet_v2,to_cuda=True)
save_featsXsize(df,model)
mnet_df = collate_featsXsize(df,model.name)
model = create_cnn_featurenet(resnet18,to_cuda=True)
save_featsXsize(df,model)
rnet_df = collate_featsXsize(df,model.name)
If we've already calculated everything just load it.
# query_image = "Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg"
df.loc[df.path==QUERY_IM,['path','classes_md']]
The DataBlock performed a number of processing steps to prepare the images for embedding into the MobileNet_v2 space (a 1280-d vector). Let's confirm that we get the same image and the same MobileNet_v2 features.
First, let's wrap our model in a function which makes sure we are in inference mode and everything is sent to the CPU.
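For example (a hypothetical wrapper; the notebook's version is get_mnet_feature, demonstrated here on a stand-in model):

```python
import torch

def get_feature(model, x):
    "Run inference on the CPU with autograd disabled."
    model = model.cpu().eval()
    with torch.no_grad():
        return model(x.cpu())

feats = get_feature(torch.nn.Linear(8, 4), torch.randn(1, 8))  # stand-in model
```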
Now we need to transform our inference image into a tensor properly sized and normalized for the model. Something like this:
That works okay, and wrapping it in a small function keeps it pretty clean...
test_feats = get_mnet_feature(mnetv2,query_t)
test_feats.shape
Buuuuuut, a fastai Pipeline is just a couple lines of code. Now we're cookin'. Note that Normalize.from_stats is usually executed on batches loaded onto the GPU, so we need to make sure it has the cuda=False flag set. I suppose we could be doing massively parallel inference and want this pipeline on a GPU at some point, but typically we'll want CPU-powered transforms.
mnet1 = get_mnetV2_feature_net()
query_t1 = load_and_prep_sneaker(images_path/QUERY_IM)
test_feats1 = get_mnet_feature(mnet1,query_t1)
mnet2 = create_cnn_featurenet(torchvision.models.mobilenet_v2)
query_t2 = load_and_prep_tf_pipe(images_path/QUERY_IM)
test_feats2 = get_mnet_feature(mnet2,query_t2)
#test_feats1.mean(),test_feats2.mean(),(test_feats1-test_feats2).max(),
#PILImage.create((query_t1-query_t2).squeeze())
qt = load_and_prep_tf_pipe(images_path/QUERY_IM)
test_feats2 = get_mnet_feature(mnetv2,qt)
test_feats2.shape,
Now I have the "embeddings" of the database in the MobileNet_v2 output space. I can do a logistic regression on these vectors (which should be identical to mapping these 1000 vectors to 4 categories (Part 3)), but I can also use an approximate KNN in this space to run the SneakerFinder tool.
I'll start with a simple "gut" test, and point out that there really isn't a ground truth to refer to. Remember that the goal of all this is to find some shoes that someone will like; we are using "similar" as an approximation of human preference.
Let's use our previously calculated sneaker features and check that the k-nearest neighbors in our embedding space feel or look "similar".
Personally, I like Jordans so I chose this as my query_image: 
Let's take a quick look at the neighbors according to our list:
num_neighs = 9
knns, reducers = get_neighs_and_reducers(df,num_neighs=num_neighs)
neighs = knns[0]
distance, nn_index = neighs.kneighbors(test_feats, return_distance=True)
dist = distance.tolist()[0]
filename = f"data/{model.name}-knnXsize_nn{num_neighs}.pkl"
dump_pickle(filename,knns)
filename = f"data/{model.name}-umapXsize.pkl"
dump_pickle(filename,reducers)
num_neighs = 9
model = create_cnn_featurenet(resnet18,to_cuda=True)
filename = f"data/knnXsize_nn{num_neighs}.pkl"
knns = load_pickle(filename)
filename = f"data/umapXsize.pkl"
reducers = load_pickle(filename)
paths_df = df[['path','classes_sm','classes_md','classes_lg']]
neighbors = paths_df.iloc[nn_index.tolist()[0]].copy()
query_t2 = load_and_prep_tf_pipe(images_path/QUERY_IM)
test_feats2 = get_mnet_feature(model,query_t2)
neighs = knns[0]
distance, nn_index = neighs.kneighbors(test_feats2, return_distance=True)
dist = distance.tolist()[0]
images = [ PILImage.create(images_path/f) for f in neighbors.path]
#PILImage.create(btn_upload.data[-1])
for im in images:
    display(im.to_thumb(IMG_SIZE, IMG_SIZE))
similar_images = get_similar_images( paths_df,model,knns)
plot_sneak_neighs(similar_images)
similar_images2 = []
for i, sz in enumerate(IMG_SIZES):
    print(SIZE_ABBR[sz])
    print(IMG_SIZES[sz])
    features = f"features_{SIZE_ABBR[sz]}"
    print(features)
    query_t = load_and_prep_sneaker(QUERY_IM2, IMG_SIZES[sz])
    query_f = get_mnet_feature(mnetv2, query_t)
    similar_images2.append(query_neighs(query_f, knns[i], paths, images_path, show=False))
    im = PILImage.create(QUERY_IM2)
    display(im.to_thumb(IMG_SIZES[sz]))
plot_sneak_neighs(similar_images2)
# first a simple PCA
pca = PCA(n_components=2)
for i, sz in enumerate(IMG_SIZES):
    print(SIZE_ABBR[sz])
    print(IMG_SIZES[sz])
    features = f"features_{SIZE_ABBR[sz]}"
    print(features)
    data = df[['Category', features]].copy()
    db_feats = np.vstack(data[features].values)

    # PCA
    pca_result = pca.fit_transform(db_feats)
    data['pca-one'] = pca_result[:, 0]
    data['pca-two'] = pca_result[:, 1]
    print(f"Explained variation per principal component (sz{sz}): {pca.explained_variance_ratio_}")

    smpl_fac = .5
    #data = df.reindex(rndperm)
    plt.figure(figsize=(16, 10))
    sns.scatterplot(
        x="pca-one",
        y="pca-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3,
    )
    plt.savefig(f'PCA 2-D sz{sz}')
    plt.show()

    # get the UMAP on deck
    embedding = reducers[i].transform(db_feats)
    data['umap-one'] = embedding[:, 0]
    data['umap-two'] = embedding[:, 1]
    plt.figure(figsize=(16, 10))
    sns.scatterplot(
        x="umap-one",
        y="umap-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3,
    )
    plt.gca().set_aspect('equal', 'datalim')
    plt.title(f'UMAP projection of mobileNetV2 embedded UT-Zappos data (sz{sz})', fontsize=24)
    plt.savefig(f'UMAP 2-D sz{sz}')
    plt.show()
fn = df.path.values
type(db_feats)
snk2vec = dict(zip(fn,db_feats))
snk2vec[list(snk2vec.keys())[0]]
embedding = get_umap_embedding(db_feats)
snk2umap = dict(zip(fn,embedding))
filename = f"zappos-50k-{model.name}-features_sort_3"
df = pd.read_pickle(f"data/{filename}.pkl")
#Display Confusion Matrix
X_test = np.vstack(df[df.t_t_v=='test']['features_lg'])
y_test = np.vstack(df[df.t_t_v=='test']['Category']).flatten()
# use validate and train for training (no validation here)
X_train = np.vstack(df[df.train | df.validate]['features_lg'])
y_train = np.vstack(df[df.train | df.validate]['Category']).flatten()
clf_log = LogisticRegression(C = 1, multi_class='ovr', max_iter=2000, solver='lbfgs')
clf_log.fit(X_train, y_train)
log_score = clf_log.score(X_test, y_test)
log_ypred = clf_log.predict(X_test)
log_confusion_matrix = confusion_matrix(y_test, log_ypred)
print(log_confusion_matrix)
disp = sns.heatmap(log_confusion_matrix, annot=True, linewidths=0.5, cmap='Blues')
plt.savefig('log_Matrix.png')
plt.figure(figsize=(16, 16))
# Plot non-normalized and normalized confusion matrices
from sklearn.metrics import plot_confusion_matrix
titles_options = [("Confusion matrix, without normalization", None),
                  ("Normalized confusion matrix", 'true')]
class_names = df.Category.unique()
for title, normalize in titles_options:
    disp = plot_confusion_matrix(clf_log, X_test, y_test,
                                 display_labels=class_names,
                                 cmap=plt.cm.Blues,
                                 normalize=normalize)
    disp.ax_.set_title(title)
    print(title)
    print(disp.confusion_matrix)
plt.savefig('log_Matrix2.png')
Here's how we could do transfer learning "by hand":
1. load the pretrained network
2. create a new linear classifier e.g. `nn.Linear(num_ftrs, n_categories)`
3. _freeze_ the parameters by setting `param.requires_grad=False` (NOTE: for this to actually work we need to NOT freeze the batchnorm layers)
4. create a training loop (or send to a fastai learner)
With the fastai API we can simply use cnn_learner with the name of the architecture, and everything else is semi-automatic, e.g.:
1. loading the architecture and pretrained weights
2. creating a classifier "head"
3. setting up the parameters for freezing (avoiding batchnorms)
Note that the MobileNet V2 architecture is NOT part of the API, so we'll need to get the weights and arch from torchvision and hack in the splitter and cut points.
I'm also going to wrap the dataframe -> DataBlock -> dataloaders in some convenience functions to make the whole shebang just a few lines.
filename = ZAPPOS_DF_SIMPLIFIED # "zappos-50k-simplified"
df = pd.read_pickle(f"data/{filename}.pkl")
df = prep_df_for_datablocks(df)
df.shape[0]/32
#dls = get_zappos_cat_dataloaders(df)
dls2 = get_zappos_cat_dataloaders()
rnet_learn = cnn_learner(dls2, resnet18, metrics=error_rate)
dls2.show_batch()
lr_min,lr_steep = rnet_learn.lr_find()
mlr = .5*(lr_min+lr_steep)
#geometric mean
gmlr = torch.tensor([lr_min,lr_steep]).log().mean().exp().tolist()
lr_min,lr_steep,mlr,gmlr
rnet_learn.fine_tune(2, base_lr=gmlr,freeze_epochs=1)
rnet_learn.show_results()
filename = 'rnet18_transfer-feb20_1x2b'
rnet_learn.save(filename)
rnet_learn.export(fname=filename)
freeze_epochs,epochs = 4,2
lr_min,lr_steep = rnet_learn.lr_find()
#geometric mean
gmlr = torch.tensor([lr_min,lr_steep]).log().mean().exp().tolist()
rnet_learn.fine_tune(epochs, base_lr=gmlr,freeze_epochs=freeze_epochs)
rnet_learn.show_results()
filename = f'rnet18_transfer-feb20_{freeze_epochs}x{epochs}b'
rnet_learn.save(filename)
rnet_learn.save('rnet18_transfer-fep20_1x2')
rnet_learn.export(fname=filename)
mnet_learn = cnn_learner(dls, torchvision.models.mobilenet_v2, n_out=4,
                         pretrained=True, metrics=error_rate)
lr_min,lr_steep = mnet_learn.lr_find()
mlr = .5*(lr_min+lr_steep)
#geometric mean
gmlr = torch.tensor([lr_min,lr_steep]).log().mean().exp().tolist()
lr_min,lr_steep,mlr,gmlr
freeze_epochs,epochs = 4,2
mnet_learn.fine_tune(epochs, base_lr=gmlr,freeze_epochs=freeze_epochs)
mnet_learn.show_results()
filename = f'mnet18_transfer-feb20_{freeze_epochs}x{epochs}b'
mnet_learn.save(filename)
mnet_learn.export(fname=filename)
freeze_epochs,epochs = 2,1
mnet_learn = cnn_learner(dls, torchvision.models.mobilenet_v2, n_out=4,
                         pretrained=True, metrics=error_rate)
lr_min,lr_steep = mnet_learn.lr_find()
gmlr = torch.tensor([lr_min,lr_steep]).log().mean().exp().tolist() #geometric mean
mnet_learn.fine_tune(epochs, base_lr=gmlr,freeze_epochs=freeze_epochs)
mnet_learn.show_results()
filename = f'mnet18_transfer-feb20_{freeze_epochs}x{epochs}b'
mnet_learn.save(filename)
mnet_learn.export(fname=filename)
interp = Interpretation.from_learner(mnet_learn)
interp.plot_top_losses(9, figsize=(15,10))
rnet_learn=load_learner('rnet_transfer-feb20_1x2a')
interp = Interpretation.from_learner(rnet_learn)
interp.plot_top_losses(9, figsize=(15,10))
interp.plot_top_losses(9)
from nbdev.export import notebook2script
notebook2script()